Disclosure risk assessment in statistical microdata protection via advanced record linkage
Abstract
The performance of Statistical Disclosure Control (SDC) methods for microdata (also called masking methods) is measured in terms of the utility and the disclosure risk associated with the protected microdata set. Empirical disclosure risk assessment based on record linkage stands out as a realistic and practical disclosure risk assessment methodology which is applicable to every conceivable masking method. The intruder is assumed to know an external data set, whose records are to be linked to those in the protected data set; the percentage of correctly linked record pairs is a measure of disclosure risk. This paper reviews conventional record linkage, which assumes shared variables between the external and the protected data sets, and then shows that record linkage, and thus disclosure, is still possible without shared variables.
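The risk measure described in the abstract (the percentage of correctly linked record pairs) can be illustrated with a minimal sketch of distance-based record linkage over shared variables. This is not the paper's implementation: the function name `linkage_risk`, the Euclidean distance on standardized variables, the additive-noise masking, and the assumption that row i of both files belongs to the same respondent are all choices made here for illustration only.

```python
import numpy as np


def linkage_risk(external, protected):
    """Percentage of external records whose nearest protected record is
    their true match.

    Illustrative assumptions (not taken from the paper):
    - both files hold the same numeric shared variables, column by column;
    - row i of `external` and row i of `protected` describe the same
      respondent, which is how a correct link is recognized;
    - the intruder uses Euclidean distance on standardized variables.
    """
    external = np.asarray(external, dtype=float)
    protected = np.asarray(protected, dtype=float)

    def standardize(a):
        # Standardize each file on its own, since the intruder only sees
        # the two files separately.
        std = a.std(axis=0)
        std[std == 0] = 1.0  # avoid division by zero for constant columns
        return (a - a.mean(axis=0)) / std

    e = standardize(external)
    p = standardize(protected)

    # Squared Euclidean distance between every external/protected pair.
    d2 = ((e[:, None, :] - p[None, :, :]) ** 2).sum(axis=2)

    # Link each external record to its closest protected record.
    links = d2.argmin(axis=1)
    correct = links == np.arange(len(e))
    return 100.0 * correct.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    original = rng.normal(size=(200, 3))  # hypothetical external file
    for noise_sd in (0.05, 0.5, 2.0):
        # Hypothetical masking method: additive Gaussian noise.
        masked = original + rng.normal(scale=noise_sd, size=original.shape)
        risk = linkage_risk(original, masked)
        print(f"noise sd {noise_sd}: {risk:.1f}% correctly linked")
```

Under these assumptions, stronger masking should generally lower the percentage of correct links, which is the trade-off against data utility that the abstract refers to. The sketch covers only the conventional case with shared variables; it does not attempt the linkage without shared variables that the paper goes on to discuss.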
Similar resources
On method-specific record linkage for risk assessment
Nowadays, the need for privacy motivates the use of methods that permit us to protect a microdata file both minimizing the disclosure risk and preserving the statistical utility. Nevertheless, research is usually focused on how data utility is preserved, and much less research effort is dedicated to the study of the tools that an intruder might use to compromise the privacy of the data or, in o...
Using Mahalanobis Distance-Based Record Linkage for Disclosure Risk Assessment
Distance-based record linkage (DBRL) is a common approach to empirically assessing the disclosure risk in SDC-protected microdata. Usually, the Euclidean distance is used. In this paper, we explore the potential advantages of using the Mahalanobis distance for DBRL. We illustrate our point for partially synthetic microdata and show that, in some cases, Mahalanobis DBRL can yield a very high re-...
A CRONYM : Data without Boundaries D
Disclosure limitation methods for protecting the confidentiality of respondents in survey microdata often use perturbative techniques which introduce measurement error into the categorical identifying variables. In addition, the data itself will often have measurement errors commonly arising from survey processes. There is a need for valid and practical ways to assess the protect...
A Quantitative Comparison of Disclosure Control Methods for Microdata
As described in Chapter 5, there is a plethora of statistical disclosure control (SDC) methods to protect microdata. This chapter provides guidance in choosing a particular SDC method by comparing some of the methods discussed in Chapter 5 on the basis of both information loss and disclosure risk. Information loss can be readily quantified using analytical measures (either generic or data-use-s...
Assessing Disclosure Risk for Record Linkage
An intruder seeks to match a microdata file to an external file using a record linkage technique. The identification risk is defined as the probability that a match is correct. The nature of this probability and its estimation is explored. Some connections are made to the literature on disclosure risk based on the notion of population uniqueness.
Journal title:
- Statistics and Computing
Volume 13, issue
Pages -
Publication date 2003